Abstract

The information geometry of the 2-manifold of gamma probability density functions provides a framework in which pseudorandom number generators may be evaluated using a neighbourhood of the curve of exponential density functions. The process is illustrated using the pseudorandom number generator in Mathematica. This methodology may be a useful addition to the current family of test procedures in real applications to finite sampling data.



1 Introduction 

The smooth family of gamma probability density functions is given by 



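$$f : [0,\infty) \to [0,\infty) : x \mapsto \frac{e^{-x\kappa/\mu}\, x^{\kappa-1} \left(\frac{\kappa}{\mu}\right)^{\kappa}}{\Gamma(\kappa)}, \qquad \mu, \kappa > 0. \tag{1}$$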

Here $\mu$ is the mean, and the standard deviation $\sigma$, given by $\kappa = (\mu/\sigma)^2$, is proportional to the mean. Hence the coefficient of variation $1/\sqrt{\kappa}$ is unity in the case that (1) reduces to the exponential distribution. Thus, $\kappa = 1$ corresponds to an underlying Poisson random process complementary to the exponential distribution. When $\kappa < 1$ the random variable $X$ represents spacings between events that are more clustered than for a Poisson process, and when $\kappa > 1$ the spacings $X$ are more uniformly distributed than for Poisson. The case when $\mu = n$ is a positive integer and $\kappa = 2$ gives the Chi-Squared distribution with $n - 1$ degrees of freedom; this is the distribution of $(n-1)s^2/\sigma_G^2$ for variances $s^2$ of samples of size $n$ taken from a Gaussian population with variance $\sigma_G^2$.
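The parametrization in Equation (1) can be checked numerically. The following Python sketch assumes (1) is the gamma density with mean $\mu$ and shape $\kappa$, that is $f(x;\mu,\kappa) = (\kappa/\mu)^{\kappa} x^{\kappa-1} e^{-\kappa x/\mu}/\Gamma(\kappa)$; the function names `gamma_pdf` and `exponential_pdf` are illustrative and do not come from the paper.

```python
import math


def gamma_pdf(x, mu, kappa):
    """Gamma density with mean mu and shape kappa (assumed form of Equation (1))."""
    if x < 0 or mu <= 0 or kappa <= 0:
        raise ValueError("need x >= 0 and mu, kappa > 0")
    if x == 0:
        # Value at the origin: unbounded for kappa < 1, 1/mu for kappa = 1, else 0.
        if kappa < 1:
            return math.inf
        return 1.0 / mu if kappa == 1 else 0.0
    rate = kappa / mu
    # Work with logarithms to avoid overflow in Gamma(kappa) for large kappa.
    log_f = (kappa * math.log(rate) + (kappa - 1.0) * math.log(x)
             - math.lgamma(kappa) - rate * x)
    return math.exp(log_f)


def exponential_pdf(x, mu):
    """Exponential density with mean mu, the kappa = 1 case."""
    return math.exp(-x / mu) / mu
```

With $\kappa = 1$ the two functions agree, for example `gamma_pdf(100.0, 511.0, 1.0)` and `exponential_pdf(100.0, 511.0)`, which is the exponential case described above.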

The gamma distribution has a conveniently tractable information geometry [1, 2], and the Riemannian metric in the 2-dimensional manifold of gamma distributions (1) is



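$$[g_{ij}](\mu,\kappa) = \begin{pmatrix} \dfrac{\kappa}{\mu^{2}} & 0 \\ 0 & \dfrac{d^{2}}{d\kappa^{2}}\log\Gamma(\kappa) - \dfrac{1}{\kappa} \end{pmatrix}. \tag{2}$$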
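As a check on Equation (2), the metric components can be recovered as expected second derivatives of the log-density. The following is a sketch of that standard Fisher information calculation, assuming Equation (1) has the form $f(x;\mu,\kappa) = (\kappa/\mu)^{\kappa} x^{\kappa-1} e^{-\kappa x/\mu}/\Gamma(\kappa)$ and using $E[X] = \mu$; it is not necessarily the derivation used in [1, 2].

$$
\begin{aligned}
\log f &= \kappa\log\kappa - \kappa\log\mu + (\kappa-1)\log x - \log\Gamma(\kappa) - \frac{\kappa x}{\mu},\\
g_{\mu\mu} &= -E\!\left[\frac{\partial^{2}\log f}{\partial\mu^{2}}\right] = -\frac{\kappa}{\mu^{2}} + \frac{2\kappa\,E[X]}{\mu^{3}} = \frac{\kappa}{\mu^{2}},\\
g_{\mu\kappa} &= -E\!\left[\frac{\partial^{2}\log f}{\partial\mu\,\partial\kappa}\right] = \frac{1}{\mu} - \frac{E[X]}{\mu^{2}} = 0,\\
g_{\kappa\kappa} &= -E\!\left[\frac{\partial^{2}\log f}{\partial\kappa^{2}}\right] = \frac{d^{2}}{d\kappa^{2}}\log\Gamma(\kappa) - \frac{1}{\kappa}.
\end{aligned}
$$

The vanishing off-diagonal term is what makes the $(\mu, \kappa)$ coordinates orthogonal.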



So the coordinates $(\mu, \kappa)$ yield an orthogonal basis of tangent vectors, which is useful in calculations because then the arc length function is simply

$$ds^{2} = \frac{\kappa}{\mu^{2}}\,d\mu^{2} + \left(\frac{d^{2}}{d\kappa^{2}}\log\Gamma(\kappa) - \frac{1}{\kappa}\right) d\kappa^{2}.$$

Figure 1: Maximum likelihood gamma parameter $\kappa$ fitted to separation statistics for simulations of Poisson random sequences of length 100000 for an element with expected parameters $(\mu, \kappa) = (511, 1)$. These simulations used the pseudorandom number generator in Mathematica [7].

We note the following important uniqueness property:



Theorem 1.1 (Hwang and Hu [4]) For independent positive random variables with a common probability density function $f$, having independence of the sample mean and the sample coefficient of variation is equivalent to $f$ being the gamma distribution.

This property is one of the main reasons for the large number of applications of gamma distributions: many near-random natural processes have standard deviation approximately proportional to the mean. Given a set of identically distributed, independent data values $X_1, X_2, \ldots, X_n$, the 'maximum likelihood' or 'maximum entropy' parameter values $\hat{\mu}, \hat{\kappa}$ for fitting the gamma distribution (1) are computed in terms of the mean and mean logarithm of the $X_i$ by maximizing the likelihood function



$$L_f(\mu,\kappa) = \prod_{i=1}^{n} f(X_i;\mu,\kappa).$$
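By taking the logarithm and setting the gradient to zero we obtain

$$\hat{\mu} = \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \tag{3}$$

$$\log\hat{\kappa} - \frac{\Gamma'(\hat{\kappa})}{\Gamma(\hat{\kappa})} = \log\bar{X} - \frac{1}{n}\sum_{i=1}^{n}\log X_i = \log\bar{X} - \overline{\log X}. \tag{4}$$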
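Equations (3) and (4) can be solved numerically as follows. This is a minimal Python sketch using only the standard library, not the Mathematica code used for the simulations: `digamma` and `fit_gamma` are hypothetical helper names, the digamma function is approximated by a recurrence plus a truncated asymptotic series, and Equation (4) is assumed to read $\log\hat{\kappa} - \Gamma'(\hat{\kappa})/\Gamma(\hat{\kappa}) = \log\bar{X} - \overline{\log X}$.

```python
import math
import random


def digamma(x):
    """Approximate Gamma'(x)/Gamma(x) for x > 0."""
    total = 0.0
    while x < 8.0:
        total -= 1.0 / x          # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # psi(x) ~ log x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return total + math.log(x) - 0.5 / x - series


def fit_gamma(data):
    """Return (mu_hat, kappa_hat) for positive data, following (3) and (4)."""
    n = len(data)
    mean = sum(data) / n                              # Equation (3)
    mean_log = sum(math.log(x) for x in data) / n
    target = math.log(mean) - mean_log                # right-hand side of (4)
    if target <= 0.0:
        raise ValueError("data must be positive and not all equal")
    # log(k) - digamma(k) decreases from +infinity to 0, so bisect for the root.
    lo, hi = 1e-6, 1e6
    for _ in range(200):
        mid = math.sqrt(lo * hi)                      # bisection on a log scale
        if math.log(mid) - digamma(mid) > target:
            lo = mid
        else:
            hi = mid
    return mean, math.sqrt(lo * hi)


# Usage: exponential spacings with mean 511 should give kappa_hat close to 1.
rng = random.Random(0)                                # arbitrary seed
sample = [rng.expovariate(1.0 / 511.0) for _ in range(20000)]
mu_hat, kappa_hat = fit_gamma(sample)
print(mu_hat, kappa_hat)
```

Bisection is chosen here for robustness rather than speed; a Newton iteration on the same equation would also work.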

2 Neighbourhoods of randomness in the gamma manifold

Sequences of pseudorandom numbers are generated in a variety of cryptological contexts, for encoding, decoding, or obscuring procedures. Tests for randomness of such sequences have been studied extensively, and the NIST suite of tests [5] for cryptological purposes is widely employed. Information-theoretic methods are also used; see, for example, Grzegorzewski and Wieczorkowski [3], and Ryabko and Monarev [6] and references therein for recent work. Here we show how pseudorandom sequences may be tested using information geometry, by using distances in the gamma manifold to compare maximum likelihood parameters for separation statistics of sequence elements.

Mathematica [7] simulations were made of Poisson random sequences with length $n = 100000$, and spacing statistics were computed for an element with abundance probability $p = 0.00195$ in the sequence. Figure 1 shows maximum likelihood gamma parameter $\kappa$ data points from such simulations. In the data from 500 simulations the ranges of maximum likelihood gamma distribution parameters were $419 < \mu < 643$ and $0.62 < \kappa < 1.56$.
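The simulation described above can be imitated in a few lines. This is a hedged Python sketch that uses Python's `random` module (a Mersenne Twister generator) in place of the Mathematica generator tested in the paper, so its numbers will differ from those in Figure 1; the seed, the helper name `spacings`, and the moment estimate of $\kappa$ (from $\kappa = (\mu/\sigma)^2$, rather than the maximum likelihood fit) are assumptions made for brevity.

```python
import random
import statistics


def spacings(length, p, rng):
    """Gaps between successive occurrences of an element with abundance p."""
    # Each position independently holds the element with probability p.
    positions = [i for i in range(length) if rng.random() < p]
    return [b - a for a, b in zip(positions, positions[1:])]


rng = random.Random(2024)                 # arbitrary seed, for reproducibility only
gaps = spacings(100_000, 0.00195, rng)

mean_gap = statistics.fmean(gaps)         # estimate of mu, expected near 1/p
cv = statistics.pstdev(gaps) / mean_gap   # sample coefficient of variation
kappa_moment = 1.0 / cv ** 2              # moment estimate of kappa, not the ML value

print(len(gaps), mean_gap, kappa_moment)
```

With roughly 195 occurrences per sequence, the estimate of $\kappa$ from a single run scatters noticeably around 1, which is consistent with the spread of fitted values reported for the 500 simulations.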

The surface height in Figure 2 represents upper bounds on information geometric distances from $(\mu, \kappa) = (511, 1)$ in the gamma manifold. This employs the geodesic mesh function we described in Arwini and Dodson [2].



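$$\mathrm{Distance}[(511, 1), (\mu, \kappa)] \le \left| \frac{d^{2}\log\Gamma}{d\kappa^{2}}(\kappa) - \frac{d^{2}\log\Gamma}{d\kappa^{2}}(1) \right| + \left| \log\frac{511}{\mu} \right|. \tag{5}$$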
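To illustrate Equation (5), the bound can be evaluated at the corners of the parameter ranges reported above. The Python sketch below assumes the bound has the form $|\psi'(\kappa) - \psi'(1)| + |\log(511/\mu)|$, with $\psi' = d^{2}\log\Gamma/d\kappa^{2}$; `trigamma` and `distance_bound` are illustrative names, and the trigamma function is approximated by a recurrence plus a truncated asymptotic series rather than taken from a library.

```python
import math


def trigamma(x):
    """Approximate d^2/dx^2 log Gamma(x) for x > 0."""
    if x <= 0:
        raise ValueError("x must be positive")
    total = 0.0
    while x < 8.0:
        total += 1.0 / (x * x)    # psi'(x) = psi'(x + 1) + 1/x^2
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # psi'(x) ~ 1/x + 1/(2x^2) + 1/(6x^3) - 1/(30x^5) + 1/(42x^7) - 1/(30x^9)
    tail = inv * inv2 * (1.0 / 6.0 - inv2 * (1.0 / 30.0
                         - inv2 * (1.0 / 42.0 - inv2 / 30.0)))
    return total + inv + 0.5 * inv2 + tail


def distance_bound(mu, kappa, mu0=511.0, kappa0=1.0):
    """Upper bound, in the assumed form of Equation (5), on the distance from (mu0, kappa0)."""
    return abs(trigamma(kappa) - trigamma(kappa0)) + abs(math.log(mu0 / mu))


# Corners of the parameter ranges observed in the 500 simulations.
for mu, kappa in [(419.0, 0.62), (419.0, 1.56), (643.0, 0.62), (643.0, 1.56)]:
    print(mu, kappa, round(distance_bound(mu, kappa), 3))
```

The bound vanishes at the reference point $(511, 1)$ and grows as either parameter moves away from it, so it can serve as a simple score for how far a fitted pair lies from the ideal Poisson case.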



Also shown in Figure 2 are data points from the Mathematica simulations of Poisson random sequences of length 100000 for an element with expected separation $\mu = 511$.

In the limit, as the sequence length tends to infinity and the abundance of the element tends to zero, we expect the gamma parameter $\kappa$ to tend to 1. However, finite sequences must be used in real applications, and the provision of a metric structure then allows us, for example, to compare real sequence generating procedures against an ideal Poisson random model.

Figure 2: Distances in the space of gamma models, using a geodesic mesh. The surface height represents upper bounds on distances from $(\mu, \kappa) = (511, 1)$ from Equation (5). Also shown are data points from simulations of Poisson random sequences of length 100000 for an element with expected separation $\mu = 511$. In the limit as the sequence length tends to infinity and the element abundance tends to zero we expect the gamma parameter $\kappa$ to tend to 1.